Patricia Crossno, Sandia National Laboratories, pjcross@sandia.gov
Brian Wylie, Sandia National Laboratories, bnwylie@sandia.gov
Andrew Wilson, Sandia National Laboratories, atwilso@sandia.gov
John Greenfield, Sandia National Laboratories, jagreen@sandia.gov
Eric Stanton, Sandia National Laboratories, etstant@sandia.gov
Tim Shead, Sandia National Laboratories, tshead@sandia.gov
Lisa Ice, Sandia National Laboratories, lgice@sandia.gov
Ken Moreland, Sandia National Laboratories, kmorel@sandia.gov
Jeff Baumes, Kitware, jeff.baumes@kitware.com
Berk Geveci, Kitware, berk.geveci@kitware.com
Student team: [ ] YES [ x ] NO
If you answered yes, name the faculty who agreed to be your sponsor: Name, email address
We used the TITAN toolkit (http://www.vtk.org), developed by Sandia National Laboratories in collaboration with Kitware, to create a custom tool, DatabaseView, that reads the entity/relationship data from the database and displays it using a variety of graphing techniques. Users interact with DatabaseView by writing SQL queries for nodes, edges, and selections, whose results are displayed as interactive graphs with control over layout, color, labeling, etc. Interactive selection can be used to display subsets of the main graph in a separate view with its own layout and display parameters. Although it required a degree of SQL expertise, building graphs through arbitrary SQL queries gave us the flexibility and control needed for rapid exploration of complex queries.
Provide a short
description of the tool(s) you used. Mention where and when it was developed.
Additional credit to developers of the tools can be provided here, and links to
find more information on the tool.
(250 words MAX)
Data set used: [ x ] RAW DATA SET [ ] PRE-PROCESSED SET
TOC:
Who – What – Where – Debriefing - Process - Video
Name | Associated organization | Involved in | Involved in terrorist activities? (Yes/No) | Most relevant source files (5 MAX) |
Cesar Gil | Gil Breeders | Yes | Yes | Chinchilla Dreamin', Week-of-Mon-20040705.txt_86, Week-of-Mon-20030901-1.txt_36, Week-of-Mon-20030609.txt_4 |
Faron Gardner | Animal Justice League (AJL) | Yes | Yes | Week-of-Mon-20030602-1.txt_66, Week-of-Mon-20030609.txt_4, Week-of-Mon-20030818.txt_23, Chinchilla Dreamin' |
Catherine Carnes (nickname Collie) | Society for the Prevention of Mistreatment to Animals (SPOMA) | Yes | Maybe | Week-of-Mon-20030526-2.txt_57, Week-of-Mon-20031013.txt_4, Week-of-Mon-20030818.txt_23, Chinchilla Dreamin' |
Luella Vedric | SPOMA | Maybe | Maybe | Week-of-Mon-20030526-2.txt_57, Week-of-Mon-20040412-2.txt_13, Week-of-Mon-20031013.txt_4, Week-of-Mon-20040119-1.txt_98 |
Rapper r’Bear (aka r’Bert – assuming typo in news) | SPOMA; Shraavana (aka Shravaana) | Yes | No | Week-of-Mon-20030609.txt_7, Week-of-Mon-20040119-1.txt_98, Week-of-Mon-20040308.txt_109, Week-of-Mon-20040614.txt_94, Week-of-Mon-20040628.txt_61 |
Madhi Kim | Global Ways | Yes | No | Week-of-Mon-20040412-2.txt_13, Week-of-Mon-20040308.txt_109, Week-of-Mon-20030526-2.txt_57 |
Abu Hassan (aka Professor Assan) | | Yes | No | ImportPermitsv3 BEST WORKING COPY, Week-of-Mon-20031215-1.txt_91, Week-of-Mon-20040301-1.txt_75, Week-of-Mon-20031013.txt_4 |
MN | | Yes | No | meeting |
Rosalind Baptista | | Yes | No | hunt8, meeting, 20040630 |
Provide a text list of
events following the sample layout. Use
short description (i.e. one or 2 lines per event)
Provide what you think
is the best subset of events (20 events MAX)
| Date | Event description | Most relevant source files (5 Max) |
1 | 2001-2002 | Mary Allen Ollesen ( | Week-of-Mon-20040705.txt_86 |
2 | 3/1/2003 – 3/1/2004 | Import permits obtained where Abu Hassan acts as consignee for | ImportPermitsv3 BEST WORKING COPY |
3 | 5/31/2003 | Luella Vedric at a SPOMA fundraiser says she is good friends with SPOMA director Catherine Carnes, dislikes violent animal rights groups, and gets her tropical fish from Mr. Kim. | Week-of-Mon-20030526-2.txt_57 |
4 | 6/5/2003 | Three PetSmart stores in | Week-of-Mon-20030602-1.txt_66, Week-of-Mon-20030609.txt_4 |
5 | 7/16/2003 | AJL sends beef and letter to Los Angeles Times, claiming meat poisoned in 20 LA supermarkets. No poisoned meat found. | Week-of-Mon-20030714-2.txt_25 |
6 | 7/18/2003 | Rosalind Baptista photographed poaching chinchillas in | hunt8 |
7 | 8/15/2003 - 9/1/2003 | Cesar Gil starts Gil Breeders, a chinchilla farm. Sells chinchillas at the West LA Farmer’s Market every Sunday, 9am-2pm. | Chinchilla Dreamin', Week-of-Mon-20030901-1.txt_36 |
8 | 10/2003-1/2004 | | Week-of-Mon-20031027.txt_57, Week-of-Mon-20040105-1.txt_58, cocaine hydro, Transport of Live Fish, DEA Files Updatev2 |
9 | 1/17/2004 | SPOMA dinner at Millennium Broadway Hotel hosted by Luella Vedric nets $230,000. Rapper r’Bear performs and donates $80,000. | Week-of-Mon-20040119-1.txt_98 |
10 | 10/13/2003 – 3/2/2004 | Hassan uses Assan Circus as traveling cover while trapping and illegally smuggling protected wildlife from | Week-of-Mon-20031013.txt_4, Week-of-Mon-20031215-1.txt_91 |
11 | 3/2/2004 | Animal Defenders International secures CITES confiscation order to seize Assan Circus animals in | Week-of-Mon-20040301-1.txt_75 |
12 | 3/14/2004 | Madhi Kim tours Shravaana Exotic Animal Sanctuary and spends day with rapper r’Bear. | Week-of-Mon-20040308.txt_109 |
13 | 4/2004 | Rosalind Baptista meets with MN along | meeting |
14 | 4/2/2004 | Cesar Gil posts cartoon to blog showing a chinchilla being infected by exposure to a small rat-like animal. | Chinchilla Dreamin', 20040402 |
15 | 4/17/2004-4/18/2004 | | Week-of-Mon-20040412-2.txt_13 |
16 | 4/2004 – 6/20/2004 | Rapper r’Bear adds 500 new animals to Shravaana, including short-tailed chinchillas. | Week-of-Mon-20040614.txt_94 |
17 | 6/30/2004 | Cesar Gil posts cartoon to blog saying that Senorita Baptista will be delivering infected poached chinchillas to US customers. | Chinchilla Dreamin', 20040630 |
18 | 6/30/2004 | r’Bear taken to hospital with bumpy face and fever. Shravaana Open House cancelled. | Week-of-Mon-20040628.txt_61 |
19 | 7/7/2004 | Monkeypox outbreak hits | Chinchilla Dreamin', Week-of-Mon-20040705.txt_83 |
20 | 7/24/2004 | Two dead in monkeypox outbreak. Cesar Gil sought and believed to have fled country. | Week-of-Mon-20040705.txt_86 |
Follow this example
layout. Use only one-line per item.
| Location | Description | Most relevant source files (5 Max) |
1 | | Site of monkeypox outbreak and AJL terrorist incidents. | Chinchilla Dreamin', Week-of-Mon-20040705.txt_83, Week-of-Mon-20040705.txt_86, Week-of-Mon-20030602-1.txt_66, Week-of-Mon-20030714-2.txt_25 |
2 | Shravaana Exotic Animal Sanctuary (near | Madhi Kim meets with r’Bear, many endangered species added to animal sanctuary, and r’Bear cancels open house when he gets sick. | Week-of-Mon-20040308.txt_109, Week-of-Mon-20040614.txt_94, Week-of-Mon-20040628.txt_61 |
3 | | Baptista meeting with MN | meeting |
4 | | Contaminated packaging in tropical fish shipments, | Week-of-Mon-20031027.txt_57, Week-of-Mon-20040105-1.txt_58, Tropical Fish Importers, Week-of-Mon-20040412-2.txt_13 |
5 | | Chinchilla trapping location and probable initial chinchilla exposed to monkeypox. | hunt8 |
Include your written assessment of
the situation (between 1000 and 2000 words)
This narrative should describe the
plot(s) and subplots(s) and how people, motivations, activities and locations
are part of the plot. Include in your narrative the relationships of the
various players. If there are
uncertainties, you can suggest possible next steps to clarify those uncertainties.
(NOTE: here there is no need to
explain how the tool helped you, focus on convincing us that you UNDERSTAND the
situation).
Main Plot
The main
plot is to create a monkeypox outbreak carried by pet
chinchillas to halt demand for, and poaching of, endangered wild
chinchillas. The central actor in this
plot is Cesar Gil, a chinchilla lover, biologist, animal rights activist, and blogger. The plot
and its motivation are described in a series of cartoons called ‘Chinsurrection’ on Gil’s blog,
Chinchilla Dreamin’.
News articles, with dates corresponding to those in the blog, confirm that the plot is executed in the
The monkeypox strain seen in
There are
varying degrees of certainty about the players in various plots. Although Gil’s role is certain, the roles of
his friends Collie and Faron are less clear. Through
news articles, we find that they are ‘Collie’ Catherine Carnes, director of the
Society for the Prevention of Mistreatment to Animals (SPOMA), and Faron Gardner, spokesperson for the Animal Justice League
(AJL). Although
Continuing
to explore the web of relationships expanding from Gil, Collie Carnes is good
friends with Luella Vedric, a
Although
the rapper r’Bear appears similar to Vedric in his relationship with Kim, r’Bear
falls into the smuggling group because he is illegally purchasing endangered
wildlife. This is established by r’Bear’s new short-tailed chinchillas at Shraavana, followed by the open house cancellation when r’Bear is hospitalized with symptoms resembling monkeypox. Since r’Bear’s chinchillas are short-tailed and no short-tailed
chinchillas have entered the country through any of the import permits, these
chinchillas must have been illegally obtained.
However, this presents a problem in explaining how r’Bear’s
chinchillas get infected, since he is near San Diego and he has a direct
connection to Global Ways, but no direct link to Cesar Gil. In fact, there are really two different
outbreaks. r’Bear’s
hospitalization happens a full week before the outbreak in
The person
with the best access would be the photographer who took pictures of Rosalind Baptista in the Choapa Valley,
Chile. Knowing where the traps are
located would allow the photographer to infect one of the trapped chinchillas
before Baptista collected it. That one chinchilla would then infect all the
chinchillas in that shipment, which closely resembles the scenario shown in Chinsurrection.
Although we have no evidence to connect the photographs with any of the
players, logical candidates would be either Carnes or Vedric
since they are both involved in investigating and stopping wild animal poaching
and smuggling. Vedric
can probably be eliminated since she has stated that she’s “against violence
toward any animal and people are just another kind of animal”. So, unless Vedric
is lying, she does not seem likely to infect either a chinchilla or a person
with monkeypox.
Also, Gil’s detailed and timely information about Baptista
is best explained by Carnes being the source, since they have a direct link. On November 7, 2003 Gil writes, “…and now,
the trappers are back in
Animal Smuggling Subplot
Abu Hassan
is using the Assan Circus as a cover for illegal wild
animal trapping and smuggling as the circus travels throughout
Drug Smuggling Subplot
Animal Justice League
Subplot
The
terrorist activities of the AJL in the
Data Preparation
We
extracted tagged entities and relationship tables from the data by running it
through
Our
strategy is to use all entities, terms, source documents, and known
relationships from the database to generate a graph that casts as wide a net as
possible, and then to probe that graph with focused queries to generate subgraphs containing relevant pieces of the puzzle. We used a custom Sandia-developed tool, DatabaseView (see tool description above), to do this. The tool’s full interface is shown in Figure 1
below. Although the full interface has
a large number of windows, unneeded ones can be hidden, while the others are
expanded to make full use of the available screen real estate. A detailed explanation of the tool startup
can be found in the video. Once we have
connected to the database and generated our wide-net graph (a corner of which
can be seen in the graph to the left of the one enclosed by the pink box), we
do a focused query that returns a list of entities and terms (the list directly
above the pink oval). Each of these is
then individually selectable for inclusion (or exclusion) as relevant nodes in
a subgraph.
The subgraph is generated by taking a
neighborhood of nodes and edges around each of the selected nodes from within
the wide-net graph. A neighborhood of
one includes all nodes directly connected to the selected nodes by an
edge. A neighborhood of two includes all
nodes that can be reached by traversing no more than two edges. Larger neighborhoods are not useful to generate
because the volume of extraneous nodes and overlapping edges hides whatever
useful information may be represented in the graph. The neighborhood is set through the parameter
field in the pink oval. The button to
its left activates the graph generation.
Alternatively, nodes can be selected through a non-SQL interface (green
box) that lists
Figure 1: DatabaseView showing the full interface with all query
windows, neighborhood distance (pink ellipse), image source file selection
(blue box) with associated image display (blue arrow), and the solution graph
(pink box).
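The neighborhood expansion described above can be sketched as a breadth-first search that stops at a fixed depth. This is a minimal illustration, not the tool's actual implementation: the function names `build_adjacency` and `neighborhood` and the placeholder node labels are assumptions made up for the example.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def neighborhood(adjacency, seeds, distance):
    """Return every node reachable from a seed in at most `distance` edges."""
    depth_of = {seed: 0 for seed in seeds}
    queue = deque(depth_of)
    while queue:
        node = queue.popleft()
        if depth_of[node] == distance:
            continue  # do not expand past the requested distance
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depth_of:
                depth_of[neighbor] = depth_of[node] + 1
                queue.append(neighbor)
    return set(depth_of)

# Placeholder chain (hypothetical labels): seed - doc1 - entity1 - doc2 - entity2
chain = build_adjacency([
    ("seed", "doc1"), ("doc1", "entity1"),
    ("entity1", "doc2"), ("doc2", "entity2"),
])
one_hop = neighborhood(chain, ["seed"], 1)   # seed and doc1
two_hop = neighborhood(chain, ["seed"], 2)   # adds entity1
```

Because each extra hop pulls in the neighbors of every node already reached, the node count can grow very quickly, which is consistent with the observation that neighborhoods larger than two become too cluttered to read.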
Our general
approach was an iterative one. Starting
from some conceptual thread or idea, we perform the following steps repeatedly
to extract an expanding set of entities and relationships to form a solution
graph:
· identify a thread to investigate
· do an SQL query on entities and terms involved in that thread
· select relevant entities and terms from a list returned by the query
· select a neighborhood distance value
· generate a graph containing the query result
· select source documents in the graph for examination
· view source documents to identify next query thread
In the
following discussion, we present a mostly linear path that attempts to cover
our plot discovery process. However, in
reality, our discovery process was highly non-linear. We explored many dead-end theories that we
will spare you from hearing about. We
revised our ideas about the relationships between people and their roles. However, we did immediately latch onto the
idea of a monkeypox plot because we came upon the
term while we were manually entering the text from the Chinsurrection
cartoons. That led us to our first query
thread, which was monkeypox. By this, we mean that we constructed an SQL
query asking for all entities or terms in the database containing ‘monkeypox’. The
resulting graph using a neighborhood of one was disjoint and not very
interesting, so we expanded to a neighborhood of two, which generated the graph
shown in Figure 2.
Figure 2: Closeup of solution graph for monkeypox query with a neighborhood of two. Notice that entity nodes are shown in red, text-based files are green, images are cyan, and terms are in various shades of blue.
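A query thread such as the monkeypox one amounts to a substring match over entity and term labels. The sketch below uses Python's built-in `sqlite3` module with a hypothetical single-table schema, `items(label, kind)`. The real database layout is not described here, so the table, its columns, and the sample rows are assumptions for illustration only.

```python
import sqlite3

def find_matching(conn, text):
    """Return (label, kind) rows whose label contains `text`."""
    cursor = conn.execute(
        "SELECT label, kind FROM items WHERE label LIKE ? ORDER BY label",
        (f"%{text}%",),  # bound parameter; % is the SQL wildcard
    )
    return cursor.fetchall()

# Hypothetical in-memory table with made-up placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (label TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("monkeypox", "term"),
        ("monkeypox outbreak", "term"),
        ("chinchilla", "term"),
    ],
)

seeds = find_matching(conn, "monkeypox")
```

The rows returned play the role of the selectable list of entities and terms; the ones kept become the seed nodes around which a neighborhood is drawn.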
In Figure
2, the original Chinsurrection image (20040707.jpg)
appears in the lower part of the graph connected to the term monkeypox (both combined in the bottom pink circle) and the
Chinchilla Dreamin’ htm
source file (circled in green). In the
pink ellipses to either side of the green circle, we see the entities
Next we
explore chinchillas as a thread and once again look at a neighborhood of two
around the query results to get a broader view of things. The resulting graph is shown in Figure 3.
Figure 3: Query on chinchillas with a neighborhood of two.
The monkeypox outbreak is shown in the upper left portion of
the graph (large pink circle). We also
see references to poaching and
Thinking
that Rosalind Baptista might be part of the plot, we
queried her, but nothing new was found.
We also queried
Figure 4: Cesar Gil added to the monkeypox query with a neighborhood of one.
We select
all four of the source text files shown in the graph. These are listed in the upper right
corner. Of these, we display the article
Week-of-Mon-20030609.txt_4 in the window in the lower right corner. We have indicated the article’s location in
the graph with the pink circle. In the
graph, the article connects Cesar Gil and
So now, we
query on Faron Gardner and get the graph shown in
Figure 5.
Figure 5: Solution graph after adding Faron Gardner to the query.
The new
solution graph now includes the Animal Justice League (upper pink circle)
within its neighborhood. Animal Justice
League is linked directly to Faron Gardner (lower
pink circle). We have selected three of
the articles in the graph (the green nodes highlighted in white within the
yellow circle) that are linked to either Faron
Gardner or Animal Justice League. From
the list of these selected articles shown in the upper right window, we have
displayed Week-of-Mon-20030602-1.txt_66 for reading. Here we find the details of the PetSmart (middle green circle) raids and are able to
connect AJL with the Animal Justice League (top and bottom green circles),
which we use to do an entity resolution between the tags.
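Entity resolution of this kind can be pictured as a lookup table that maps each alias to one canonical name, applied to graph edges so that duplicate tags collapse into a single node. The sketch below is an assumption about how such a table might work, not a description of the tool; the helper names `build_resolver`, `resolve`, and `merge_edges` are hypothetical.

```python
def build_resolver(alias_table):
    """Map every alias (and the canonical name itself) to the canonical name."""
    resolver = {}
    for canonical, aliases in alias_table.items():
        resolver[canonical.casefold()] = canonical
        for alias in aliases:
            resolver[alias.casefold()] = canonical
    return resolver

def resolve(resolver, tag):
    """Return the canonical name for a tag, or the tag itself if unknown."""
    return resolver.get(tag.casefold(), tag)

def merge_edges(resolver, edges):
    """Rewrite edge endpoints to canonical names and drop duplicates."""
    merged = set()
    for a, b in edges:
        a, b = resolve(resolver, a), resolve(resolver, b)
        if a != b:  # skip self-loops created by merging two aliases
            merged.add((a, b))
    return merged

resolver = build_resolver({"Animal Justice League": ["AJL"]})
edges = merge_edges(resolver, [
    ("AJL", "Week-of-Mon-20030602-1.txt_66"),
    ("Animal Justice League", "Week-of-Mon-20030602-1.txt_66"),
])
```

After merging, the article is linked to one Animal Justice League node rather than to two separately tagged ones.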
Seeking to
expand this thread and look for connections between the AJL and the monkeypox plot, we add Animal Justice League and PetSmart to our query. However, a neighborhood of one does not give
us any particular insights, so we expand to a neighborhood of two, a zoomed-in
section of which is shown in Figure 6.
Figure 6: Solution graph showing addition of Animal Justice League and PetSmart.
We focus on
the area (pink circle) near the Animal Justice League (blue circle), where we
find the article Week-of-Mon-20030526-2.txt_57 connecting Catherine Carnes to
AJL and the Society for the Prevention of Mistreatment to Animals. Selecting this article for reading, we
discover that Catherine Carnes is nicknamed Collie (which connects her to Cesar
Gil since he lists Collie as a friend in his blog)
and that she is director of SPOMA.
Additionally, the article further expands the social network of the
animal rights activists by establishing a connection between Carnes and Luella Vedric, who says that Collie is one of her dearest
friends. We link Collie and Catherine
Carnes, along with SPOMA and the Society for the Prevention of Mistreatment to
Animals, to our entity resolution table.
Since Vedric is not within our neighborhood,
we need to add her directly to the query.
Plus, adding Catherine Carnes and SPOMA might further expand our social
network or uncover more animal rights terrorist activities, so we add all three
to our query. A closeup of the query
result is shown in Figure 7.
Figure 7: Closeup of query results for adding Luella Vedric, Catherine Carnes, and SPOMA.
By adding Vedric and Carnes to the graph (and reducing the clutter by
reducing the neighborhood back to one), we can now see a fairly clear
representation of the people (Gil, Gardner, Carnes, and Vedric),
organizations (AJL and SPOMA), locations (Los Angeles and New York), source
articles (Chinchilla Dreamin’ and a number of news
articles), and the relationships between them in Figure 7. Selecting the two articles that link Carnes
and Vedric (pink circle), we read
Week-of-Mon-20031013.txt_4 and see that Vedric and
Carnes are tracking the Assan Circus in
Figure 8: Query results for Assan Circus with a neighborhood of two.
In Figure
8, the Assan Circus (middle pink circle) appears
connecting two articles. The one on the
right is the article that we have already read, which describes Vedric and Carnes’ (both circled in pink) efforts to track
the circus and stop abuses. The article
on the left, Week-of-Mon-20040301-1.txt_75, connects the circus to someone
named Abu Hassan and references
Figure 9:
Zoom in on query results, graphed with a neighborhood of one, after adding Abu
Hassan and
Figure 9
shows that Abu Hassan and Zimbabwe (two middle pink circles) are referenced in
the Import Permits spreadsheet and that he is connected to an animal import
deal with a company called Global Ways (the column of pink circles on the left
side of the graph). Viewing the Import
Permits spreadsheet, we discover that Abu Hassan is the consignee for every
import permit associated with
Figure 10:
Closeup of query results for
The import
permits that
This image
also introduces two new people, Madhi Kim and r’Bear (blue circle), who are connected to
Figure 11: Gambian rat reference.
We will
stop with our exploration at this point, since the use of the tool should be
clear by now. We were unable to
definitively resolve our two loose ends, the source of the monkeypox
and how the chinchillas were infected.
Since rodents are carriers, we tried a dead-end query thread on
laboratory animal abductions and lab break-ins, which we will skip over
here. Then we found the comment about
Gambian rats (green circle in text window in Figure 11) in the article
Week-of-Mon-20040705.txt_86 (green circle in the graph view in Figure 11). The reference to the Gambian rats
mysteriously dying and the fact that Ollesen is a
business owner in